Selection of sublexical units for continuous speech recognition of Basque

Authors

  • Karmele López de Ipiña
  • M. Inés Torres
  • Lourdes Oñederra
  • Amparo Varona
  • Luis Javier Rodríguez-Fuentes
Abstract

This paper describes the work carried out to select the most suitable set of Sublexical Units for Continuous Speech Recognition of Basque. Although Basque has several dialects, only one of them was used to choose the preliminary set of sounds. With this aim in mind, extensive experiments were carried out to select Context-Independent Phone-Like Units. Then, in order to obtain robust acoustic models for the language, the units were evaluated with most of the dialectal variants of Basque. Finally, decision-tree-based Context-Dependent Sublexical Units were selected. The trees were built using the classical methodology of Bahl and the efficient Growing and Pruning algorithm.

Similar resources

Decision Tree-Based Context Dependent Sublexical Units for Continuous Speech Recognition of Basque

This paper presents a new methodology, based on the classical decision trees, to get a suitable set of context dependent sublexical units for Basque Continuous Speech Recognition (CSR). The original method proposed by Bahl [1] was applied as the benchmark. Then two new features were added: a data massaging to emphasise the data and a fast and efficient Growing and Pruning algorithm for DT const...

Automatic Morphological Segmentation for Continuous Speech Recognition of Basque

The selection of appropriate Lexical Units (LUs) is an important issue in the development of Continuous Speech Recognition (CSR) systems. Words have classically been used as the unit in most of them. However, proposals of non-word units have begun to arise. Since the subject of this study is the Basque language, which is an agglutinative language with a complex structure inside words, non-word units ...

First approach to the selection of lexical units for continuous speech recognition of Basque

The selection of appropriate Lexical Units is an important issue in Language Model (LM) generation. Words have classically been used as the unit in most Continuous Speech Recognition systems. However, in recent years proposals of non-word units have begun to appear. Since Basque is an agglutinative language with a certain structure inside the word, the non-word units could be an adeq...

Selection of Lexical Units for Continuous Speech Recognition of Basque

The selection of appropriate Lexical Units (LUs) is an important issue in the development of Continuous Speech Recognition (CSR) systems. Words have been used classically as the recognition unit in most of them. However, proposals of non-word units are beginning to arise. Basque is an agglutinative language with some structure inside words, for which non-word morpheme-like units could be an appr...

Lexical and Sublexical Units in Speech Perception

Saffran, Newport, and Aslin (1996a) found that human infants are sensitive to statistical regularities corresponding to lexical units when hearing an artificial spoken language. Two sorts of segmentation strategies have been proposed to account for this early word-segmentation ability: bracketing strategies, in which infants are assumed to insert boundaries into continuous speech, and clusterin...

Publication date: 2000